Add long-term memory retrieval powered by qmd (BM25 and VSearch). #34

Open
mczabca-boop wants to merge 6 commits into main from feat/qmd-memory-retrieval-pr-ready

@mczabca-boop (Collaborator) commented Feb 13, 2026

PR Title

Improve QMD Memory Retrieval Reliability, Observability, and Claude Injection Safety

Summary

This PR hardens TinyClaw’s memory pipeline end-to-end and removes several reliability pitfalls found during real Telegram regression testing.

It keeps memory retrieval QMD-centric (BM25 + VSearch), improves retrieval quality and debug visibility, and fixes the critical issue where memory snippets were retrieved but not reliably consumed by Claude.

Why

Memory behavior was inconsistent in production-like tests due to:

  • semantic hits returning noisy/self-referential snippets
  • stale/low-confidence snippets polluting final answers
  • path normalization mismatches preventing turn hydration/reranking
  • memory being injected into a file Claude CLI did not reliably consume
  • risk of cleanup overwriting concurrent edits when restoring CLAUDE.md
  • high VSearch overhead because embedding ran synchronously on the query path

This PR addresses those issues while preserving optional memory and safe defaults.

Key Changes

1. QMD retrieval flow hardening

  • Retained QMD-only retrieval path:
    • BM25 (qmd search)
    • VSearch (qmd vsearch)
  • Added stronger fallback behavior:
    • if VSearch results are filtered down to unusable snippets, fall back to BM25
  • Reused lexical query variants across precheck/main BM25 flow to avoid redundant recomputation.
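
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/lib/memory.ts: the `Snippet` shape, the `retrieveMemory` name, and the injected search functions are assumptions, and the searches are synchronous here only for brevity (real calls to qmd would likely be asynchronous).

```typescript
// Hypothetical types and names; the PR's actual implementation may differ.
interface Snippet {
  source: string;
  text: string;
  score: number;
}

// Injected so the sketch does not assume how qmd is invoked.
type SearchFn = (query: string) => Snippet[];

interface RetrievalResult {
  source: "qmd-vsearch" | "qmd-bm25";
  snippets: Snippet[];
}

function retrieveMemory(
  query: string,
  vsearch: SearchFn | null, // null when semantic search is disabled
  bm25: SearchFn,
  isUsable: (s: Snippet) => boolean,
): RetrievalResult {
  if (vsearch) {
    try {
      const usable = vsearch(query).filter(isUsable);
      if (usable.length > 0) {
        return { source: "qmd-vsearch", snippets: usable };
      }
      // Every semantic hit was filtered out: fall through to BM25.
    } catch {
      // Semantic search failed: fall through to BM25.
    }
  }
  return { source: "qmd-bm25", snippets: bm25(query).filter(isUsable) };
}
```

Returning the source label alongside the snippets lets the caller log which path produced the result.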

2. Reranking and snippet quality improvements

  • Moved rerank heuristics into a dedicated module:
    • src/lib/memory-rerank.ts
  • Added configurable rerank settings under memory.rerank.
  • Added stronger low-confidence filtering:
    • low-confidence assistant snippets are now filtered out from injection (not only downranked).
  • Added rerank debug output to inspect top selected snippets in logs.
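
As an illustration of the difference between filtering and downranking, here is a small sketch. The `Candidate` shape and the `minAssistantScore` and `topK` settings are hypothetical stand-ins, not the actual keys under memory.rerank.

```typescript
// Hypothetical shapes; the real heuristics live in src/lib/memory-rerank.ts.
interface Candidate {
  role: "user" | "assistant";
  text: string;
  score: number; // higher is better
}

interface RerankSettings {
  minAssistantScore: number; // assistant snippets below this are dropped
  topK: number; // maximum number of snippets to inject
}

function selectForInjection(
  candidates: Candidate[],
  settings: RerankSettings,
): Candidate[] {
  return candidates
    // Drop low-confidence assistant snippets entirely instead of ranking them lower.
    .filter((c) => !(c.role === "assistant" && c.score < settings.minAssistantScore))
    // filter() returned a new array, so sorting does not mutate the input.
    .sort((a, b) => b.score - a.score)
    .slice(0, settings.topK);
}
```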

3. Turn hydration bug fixes

  • Fixed source-to-turn-file resolution robustness:
    • handles qmd source normalization differences (case/punctuation variants like _ vs -)
  • Standardized newly persisted turn filenames to lowercase to reduce future mismatch risk.
  • Result: rerank/hydration can reliably use full User/Assistant turn content instead of raw patch-like snippets.
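
One way to make source-to-file matching tolerant of case and punctuation variants is to compare normalized keys. The sketch below assumes turn files are Markdown files identified by basename; the function names and example paths are hypothetical.

```typescript
// Reduce a file name to a comparison key: lowercase, no .md extension,
// and every run of non-alphanumeric characters collapsed to a single "-".
function normalizeSourceKey(name: string): string {
  return name
    .toLowerCase()
    .replace(/\.md$/, "")
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Find the stored turn file that corresponds to a source reported by qmd.
function resolveTurnFile(
  qmdSource: string,
  turnFiles: string[],
): string | undefined {
  const baseName = (p: string): string => p.split("/").pop() ?? p;
  const wanted = normalizeSourceKey(baseName(qmdSource));
  return turnFiles.find((f) => normalizeSourceKey(baseName(f)) === wanted);
}
```

With this approach a source such as `memory/turn-2026-02-13-abc.md` would resolve to a stored file named `turns/Turn_2026-02-13_ABC.md`.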

4. Claude memory injection redesign

  • Replaced runtime injection target:
    • from .claude/MEMORY.md
    • to .claude/CLAUDE.md runtime section
  • Added safer cleanup strategy:
    • inject with unique start/end markers
    • cleanup removes only marker-bounded runtime block
    • avoids full-file rollback overwrite risk when file changes during invocation
  • Retained defensive cleanup for legacy MEMORY.md.
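
The marker-bounded approach can be sketched on plain strings as below. The marker format and function names are assumptions for illustration; the PR only states that the markers are unique start/end pairs and that cleanup removes just the bounded block.

```typescript
// Hypothetical marker format, keyed by a per-invocation id.
function runtimeMarkers(runId: string): { start: string; end: string } {
  return {
    start: `<!-- tinyclaw-memory:start ${runId} -->`,
    end: `<!-- tinyclaw-memory:end ${runId} -->`,
  };
}

// Append the runtime memory block to the current CLAUDE.md content.
function injectRuntimeBlock(content: string, memory: string, runId: string): string {
  const { start, end } = runtimeMarkers(runId);
  return `${content}\n${start}\n${memory}\n${end}\n`;
}

// Remove only the marker-bounded block; everything else is left as found,
// so edits made to the file during the invocation survive cleanup.
function removeRuntimeBlock(content: string, runId: string): string {
  const { start, end } = runtimeMarkers(runId);
  const startIdx = content.indexOf(start);
  if (startIdx === -1) return content; // nothing was injected
  const endIdx = content.indexOf(end, startIdx + start.length);
  if (endIdx === -1) return content; // unbalanced markers: do not guess
  // Also consume the separator newline added before the block on injection.
  const from = startIdx > 0 && content[startIdx - 1] === "\n" ? startIdx - 1 : startIdx;
  let to = endIdx + end.length;
  if (content[to] === "\n") to += 1;
  return content.slice(0, from) + content.slice(to);
}
```

Compared with saving the whole file and restoring it afterwards, this cleanup never writes back a stale copy, which is the overwrite risk the redesign is meant to avoid.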

5. Embed/update behavior and latency improvements

  • Increased default embed interval to reduce runtime overhead:
    • default embed_interval_seconds: 600
  • Changed embed trigger to asynchronous fire-and-forget (non-blocking query path).
  • Added in-flight guard per collection to avoid duplicate embed runs.
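
A fire-and-forget trigger with a per-collection in-flight guard might look like the sketch below. The `createEmbedTrigger` name, the injected `runEmbed` function, and the injectable clock are assumptions; only the 600-second default comes from this PR.

```typescript
// Injected so the sketch does not assume how the embed step is executed.
type EmbedRunner = (collection: string) => Promise<void>;

function createEmbedTrigger(
  runEmbed: EmbedRunner,
  intervalSeconds = 600,
  now: () => number = Date.now,
): (collection: string) => boolean {
  const inFlight = new Set<string>();
  const lastStarted = new Map<string, number>();

  // Returns true if an embed run was started. The caller never awaits it.
  return (collection: string): boolean => {
    if (inFlight.has(collection)) return false; // a run is already in progress
    const last = lastStarted.get(collection);
    if (last !== undefined && now() - last < intervalSeconds * 1000) {
      return false; // ran too recently
    }

    inFlight.add(collection);
    lastStarted.set(collection, now());
    // Fire and forget: failures are logged, and the guard is always released.
    void Promise.resolve()
      .then(() => runEmbed(collection))
      .catch((err: unknown) => console.warn(`embed failed for ${collection}:`, err))
      .finally(() => inFlight.delete(collection));
    return true;
  };
}
```

Because the trigger returns immediately, the query path pays no embed latency; the trade-off is that a query may run against slightly stale vectors.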

6. Logging and observability

  • Added/kept clearer memory-source logs:
    • qmd-bm25 / qmd-vsearch
  • Added injection-path logs.
  • Added optional debug logging for:
    • mode, timeout, query-used, rerank summary, fallback behavior.
  • Improved warning behavior to be agent-scoped instead of process-global for qmd-unavailable cases.
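
One reading of "agent-scoped" warnings is that the qmd-unavailable warning is emitted once per agent rather than once per process. The sketch below illustrates that interpretation only; the function name and log format are hypothetical.

```typescript
// Returns a warn function that logs at most once per agent id.
function createAgentScopedWarner(
  log: (message: string) => void,
): (agentId: string, message: string) => boolean {
  const warned = new Set<string>();
  return (agentId: string, message: string): boolean => {
    if (warned.has(agentId)) return false; // this agent was already warned
    warned.add(agentId);
    log(`[${agentId}] ${message}`);
    return true;
  };
}
```

With a process-global flag, the first agent to hit the problem would suppress the warning for every other agent; keying the state by agent id avoids that.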

7. Configuration/setup/docs updates

  • Setup wizard wording keeps VSearch explicitly marked as experimental:
    • Use semantic search (vector, experimental)? [y/N]
  • Updated setup defaults and README to reflect:
    • safer semantic behavior
    • async embed behavior
    • longer embed interval
    • retention/rerank controls
    • memory source observability.

Validation Performed

  • npm run build:main passed after changes.
  • bash -n lib/setup-wizard.sh passed.
  • End-to-end Telegram regressions validated:
    • correct memory source logs (qmd-vsearch, qmd-bm25 where applicable)
    • runtime injection log now shows .claude/CLAUDE.md (runtime section)
    • previously failing case (Who likes rock?) now correctly answers from retrieved memory after injection fix
    • repeated reset does not wipe persisted QMD memory (session reset behavior preserved).

Backward Compatibility

  • Memory remains optional.
  • TinyClaw still runs without qmd/bun; memory retrieval degrades gracefully when unavailable.
  • Existing deployments continue to work; behavior is now more explicit and debuggable.

Notes / Reviewer Focus

Please focus review on:

  • src/lib/memory.ts retrieval flow + fallback + hydration/rerank
  • src/lib/memory-rerank.ts configuration-driven heuristics
  • src/lib/invoke.ts marker-based CLAUDE.md runtime injection/cleanup safety
  • setup/README defaults and wording around experimental semantic search.

@mczabca-boop changed the title from "feat(memory): add optional qmd retrieval with safe defaults" to "## Summary Add optional long-term memory retrieval powered by qmd, with safe defaults and documentation updates. ## Changes - add src/lib/memory.ts for memory retrieval integration - wire memory retrieval into invoke/queue flow - extend config types for memory/qmd options - document qmd installation and Linux/WSL dependency in README - add troubleshooting notes for memory/qmd behavior ## Config Notes - memory remains opt-in (memory.enabled: false by default) - qmd can be enabled independently under memory.qmd - supports configurable top_k, min_score, max_chars, and update interval ## Validation - npm run build passes - manual Telegram flow tested for: - memory disabled: no retrieval injection - memory enabled: retrieval hit appears in logs ## Risk / Compatibility - no breaking change for existing users - users without qmd are unaffected when memory is disabled" on Feb 13, 2026
@mczabca-boop changed the title from "## Summary Add optional long-term memory retrieval powered by qmd, with safe defaults and documentation updates. ## Changes - add src/lib/memory.ts for memory retrieval integration - wire memory retrieval into invoke/queue flow - extend config types for memory/qmd options - document qmd installation and Linux/WSL dependency in README - add troubleshooting notes for memory/qmd behavior ## Config Notes - memory remains opt-in (memory.enabled: false by default) - qmd can be enabled independently under memory.qmd - supports configurable top_k, min_score, max_chars, and update interval ## Validation - npm run build passes - manual Telegram flow tested for: - memory disabled: no retrieval injection - memory enabled: retrieval hit appears in logs ## Risk / Compatibility - no breaking change for existing users - users without qmd are unaffected when memory is disabled" to "Add optional long-term memory retrieval powered by qmd, with safe defaults and documentation updates." on Feb 13, 2026
@mczabca-boop force-pushed the feat/qmd-memory-retrieval-pr-ready branch from a9ae0ca to 5b0f7ce on February 15, 2026 06:13
* fix(whatsapp): fail fast when Puppeteer Chrome is missing

* fix(whatsapp): validate Puppeteer executable path exists
@mczabca-boop changed the title from "Add optional long-term memory retrieval powered by qmd, with safe defaults and documentation updates." to "Add long-term memory retrieval powered by qmd (BM25 and VSearch)." on Feb 18, 2026
@mczabca-boop requested a review from jlia0 on February 18, 2026 05:53